\section{Conclusions}
\label{sec:conclusions}

In this paper, we considered the problem of performing the Kolmogorov-Smirnov test on streaming data. We gave algorithms for both the one-sample and the two-sample versions of the test, along with guarantees of their performance. Our algorithms make use of the techniques for computing quantiles on a stream and hence may have improved results when further progress is made on this problem. Besides giving analytical guarantees, we show via experiments on real and synthetic data that the proposed algorithms are capable of performing the test with a two order magnitude reduction in the size of the data. Our experiments also showed that the Greenwald-Khanna sketch is best suited for our algorithm, and that it is considerably superior to other simple techniques such as sampling.

There are several open problems that still remain. 
The algorithms proposed in this paper need $O(\sqrt{n}\log{n})$ samples, and it would be interesting to see if more efficient algorithms are possible or if there is a matching lower bound. 
%While the algorithm proposed in this paper makes use of quantile sketches, and hence has the same space usage, it is unclear whether a testing algorithm exists that uses asymptotically less space than any quantile sketch. It would be interesting to find such an algorithm, or alternatively to prove that any quantile sketch must use as much space as any testing algorithm. 
% Other open question?
There are also many other statistical tests, both parametric and non-parametric, that do not have known streaming algorithms. Identifying which ones have sublinear space algorithms and developing these algorithms is an avenue for future work.


%\noindent{\bf Acknowledgement:} We thank anonymous reviewers for
%very helpful comments on the paper. And Erin. And Ryan.
